Parsing a Free-Word Order Language: Warlpiri

Author

  • Michael B. Kashket
Abstract

Free-word order languages have long posed significant problems for standard parsing algorithms. This paper reports on an implemented parser, based on Government-Binding theory (GB) (Chomsky, 1981, 1982), for a particular free-word order language, Warlpiri, an aboriginal language of central Australia. The parser is explicitly designed to transparently mirror the principles of GB. The operation of this parsing system is quite different in character from that of a rule-based parsing system, e.g., a context-free parsing method. In this system, phrases are constructed via principles of selection, case-marking, case-assignment, and argument-linking, rather than by phrasal rules. The output of the parser for a sample Warlpiri sentence of four words in length is given. The parser was executed on each of the 23 other permutations of the sentence, and it output equivalent parses, thereby demonstrating its ability to correctly handle the highly scrambled sentences found in Warlpiri.

INTRODUCTION

Basing a parser on Government-Binding theory has led to a design that is quite different from traditional algorithms.[1] The parser presented here operates in two stages, lexical and syntactic. Each stage is carried out by the same parsing engine. The lexical parser projects each constituent lexical item (morpheme) according to information in its associated lexical entries. Lexical parsing is highly data-driven from entries in the lexicon, in keeping with GB. Lexical parses returned by the first stage are then handed over to the second stage, the syntactic parser, as input, where they are further projected and combined to form the final phrase marker.

Before plunging into the parser itself, a sample Warlpiri sentence is presented. Following this, the theory of argument (i.e., NP) identification is given, in order to show how its substantive linguistic principles may be used directly in parsing.
Both the lexicon and the other basic data structures are then discussed, followed by a description of the central algorithm, the parsing engine. Lexical phrase-markers produced by the parser for the words kurduku and puntarni are then given. Finally, the syntactic phrase-marker for the sample sentence is presented. All the phrase-markers shown are slightly edited outputs of the implemented program.

[1] Johnson (1985) reports another design for analyzing discontinuous constituents; it is not grounded on any linguistic theory, however.

A SAMPLE SENTENCE

In order to make the presentation of the parser a little less abstract, a sample sentence of Warlpiri is shown in (1):

(1) Ngajulu-rlu  ka-rna-rla  punta-rni  kurdu-ku   karli.
    I-ERG        PRES-1-3    take-NPST  child-DAT  boomerang
    'I am taking the boomerang from the child.'

(The hyphens are introduced for the nonspeaker of Warlpiri in order to clearly delimit the morphemes.) The second word, karnarla, is the auxiliary which must appear in the second (Wackernagel's) position. Except for the auxiliary, the other words may be uttered in any order; there are 4! ways of saying this sentence.

The parser assumes that the input sentence can be broken into its constituent words and morphemes.[2] Sentence (1) would be represented as in (2). The parser cannot yet handle the auxiliary, so it has been omitted from the input.

(2) ((NGAJULU RLU) (PUNTA RNI) (KURDU KU) (KARLI))

ARGUMENT IDENTIFICATION

Before presenting the lexicon, GB argument identification as it is construed for the parser is presented.[3] Case is used to identify syntactic arguments and to link them to their syntactic predicates (e.g., verbal, nominal and infinitival). There are three such cases in Warlpiri: ergative, absolutive and dative. Argument identification is effected by four subsystems involving case: selection, case-marking, case-assignment, and argument-linking.
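As a rough illustration of why identifying arguments by case is insensitive to word order, the following Python sketch links the nominals of the four-word input, ((NGAJULU RLU) (PUNTA RNI) (KURDU KU) (KARLI)), to the verb purely by their case morphology, and checks that all 4! orderings give the same result. It is a toy under stated assumptions, not the parser described in this paper: the suffix tables, the function name `link_arguments`, and the treatment of a bare nominal as absolutive are introduced here for illustration only, and the sketch does not model projection, selection, case-assignment, or the auxiliary.

```python
from itertools import permutations

# The four-word input from the text: each word is a tuple of its morphemes.
SENTENCE = (("NGAJULU", "RLU"), ("PUNTA", "RNI"), ("KURDU", "KU"), ("KARLI",))

# Hypothetical toy tables (assumptions, not the paper's lexicon format).
# The case values follow the glosses ERG and DAT in example (1).
CASE_SUFFIXES = {"RLU": "ergative", "KU": "dative"}
VERB_SUFFIXES = {"RNI"}  # glossed NPST in example (1)


def link_arguments(words):
    """Return (predicate stem, {case: nominal stem}) for a sequence of words.

    Linking looks only at each word's own morphemes, never at its position.
    """
    predicate = None
    arguments = {}
    for word in words:
        stem, suffixes = word[0], word[1:]
        if any(s in VERB_SUFFIXES for s in suffixes):
            predicate = stem
            continue
        # Assumption: a nominal without a case suffix counts as absolutive.
        case = next(
            (CASE_SUFFIXES[s] for s in suffixes if s in CASE_SUFFIXES),
            "absolutive",
        )
        if case in arguments:
            raise ValueError(f"two arguments marked {case}")
        arguments[case] = stem
    return predicate, arguments


reference = link_arguments(SENTENCE)
orders = list(permutations(SENTENCE))

# 4! = 24 orderings: the sample sentence plus 23 others.
assert len(orders) == 24
# Every ordering yields the same predicate and the same case-to-argument links.
assert all(link_arguments(order) == reference for order in orders)
print(reference)
```

Because the links are stored in a dictionary keyed by case, two orderings compare equal whenever each case is paired with the same nominal, which is the sense of "equivalent" used in this sketch.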
Only maximal projections (e.g., NP and VP, in English) are eligible to be arguments. In order ...

[2] Barton (1985) has written a morphological analyzer that breaks down Warlpiri words into their constituent morphemes. We have connected both parsers so that the user is able to enter sentences in a less stilted form. Input (2), however, is given directly to the main parser, bypassing Barton's analyzer.

[3] This analysis of Warlpiri comes from several sources, and from the helpful assistance of Mary Laughren. See, for example, (Laughren, 1978; Nash, 1980; Hale, 1983).


Similar resources

The Effect of Morphology on Persian Dependency Parsing

Data-driven systems can be adapted to different languages and domains easily. This trend has led to the introduction of data-driven approaches in dependency parsing. The existence of appropriate corpora that contain sentences and their associated dependency trees is the only prerequisite for data-driven approaches. Despite obtaining highly accurate results for the dependency parsing task in English langu...


Constituent-Based Morphological Parsing: A New Approach to the Problem of Word-Recognition

We present a model of morphological processing which directly encodes prosodic constituency, a notion which is clearly crucial in many widespread morphological processes. The model has been implemented for the Australian language Warlpiri and has been successfully interfaced with a syntactic parser for that language (Brunson, 1986). We contrast our approach with approaches to morphological pars...


Competition between word order and case-marking in interpreting grammatical relations: a case study in multilingual acquisition.

The study examines strategies multilingual children use to interpret grammatical relations, focusing on their two primary languages, Lajamanu Warlpiri and Light Warlpiri. Both languages use mixed systems for indicating grammatical relations. In both languages ergative-absolutive case-marking indicates core arguments, but to different extents in each language. In Lajamanu Warlpiri, pronominal cl...


Arguments desperately seeking Interpretation: Parsing German Infinitives

In this paper we present a GB-parsing system for German and in particular the system's strategy for argument interpretation, which copes with the difficulty that word order is relatively free in German and also that arguments can precede their predicate. In this latter case, the parser makes a provisional interpretation, which is checked when the argument structure of the predicate is availabl...


Parsing German Topological Fields with Probabilistic Context-Free Grammars

Jackie Chi Kit Cheung, M.Sc., Graduate Department of Computer Science, University of Toronto, 2009. Syntactic analysis is useful for many natural language processing applications requiring further semantic analysis. Recent research in statistical parsing has produced a number of high-performance parsers using probabilistic co...




Publication date: 1986